Biostatistics For Dummies, 2nd Edition (Monika Wahi, John Pezzullo)

316 PART 6 Analyzing Survival Data

start of improvement or remission, date of relapse, or others. For events, record the date of each occurrence if the event can recur. Even if death is not the event of interest, record the date of death if it is available. For censoring purposes, collect the dates of contact so you can identify a last-seen date if needed. If you collect your data properly, you will later be able to calculate any time interval you need and create the event status indicator.
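As a minimal sketch of how recorded dates turn into analysis variables, the following Python example derives a time interval and an event status indicator for each participant. The function name, the argument names, and the two participants are hypothetical and serve only as illustration. The sketch assumes that a participant without an event date is censored at the last contact date.

```python
from datetime import date

def survival_record(start, event_date, last_contact):
    """Return (time_in_days, status) for one participant.

    status is 1 if the event was observed and 0 if the participant
    is censored at the last contact date (an assumption of this sketch).
    """
    if event_date is not None:
        return (event_date - start).days, 1
    return (last_contact - start).days, 0

# Hypothetical participants: (start date, event date or None, last contact date)
participants = [
    (date(2020, 1, 15), date(2021, 3, 1), date(2021, 3, 1)),
    (date(2020, 2, 1), None, date(2022, 1, 31)),
]

records = [survival_record(*p) for p in participants]
print(records)  # [(411, 1), (730, 0)]
```

The first participant had the event 411 days after the start date. The second had no recorded event, so the time runs to the last-seen date and the status is 0.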

Dates and times should be recorded to suitable precision. If your study timeline spans years, it’s best to keep track of dates to the day. In a Phase I clinical trial (see Chapter 5), participants may be studied for events that happen in a span of a few days. In those cases, it’s important to record dates and times to the nearest hour or minute. You can even envision laboratory studies of intracellular events where time would have to be recorded with millisecond, or even microsecond, precision!

Dates and times can be stored in different ways in different statistical software (as well as Microsoft Excel). Designating columns as being in date format or time format lets you perform calendar arithmetic, so you can obtain time intervals by subtracting one date from another.
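To illustrate calendar arithmetic at hour-and-minute precision, here is a short Python sketch using only the standard library. The timestamps, the variable names, and the text format are assumptions made up for this example. Your own software may store and parse dates differently.

```python
from datetime import datetime

# Assumed text format for the recorded timestamps (hypothetical)
fmt = "%Y-%m-%d %H:%M"

# Parse the text into true date-time values so they can be subtracted
dose_time = datetime.strptime("2023-06-01 08:30", fmt)
event_time = datetime.strptime("2023-06-03 14:45", fmt)

# Subtracting one date-time from another yields a time interval
elapsed = event_time - dose_time
elapsed_hours = elapsed.total_seconds() / 3600

print(elapsed.days)    # 2
print(elapsed_hours)   # 54.25
```

If the same values were left as plain text, the subtraction would fail, which is why it helps to designate the column as a date or time type first.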

Miscoding censoring information

It can be surprisingly easy to miscode the event status indicator. If the name of the variable is Death, and it is coded as 1 if the participant died during the observation period and 0 if they were censored, this seems intuitive. But analysts may want to identify all the censored observations in their data, so they may create a censoring indicator named Censored, and code it as 1 if the participant is censored and 0 if they are not. Because data may be used for different types of survival analyses, there could be other event indicators in the data as well, also coded as 1 and 0.

The problem is that if you accidentally use your censored indicator instead of your event indicator when running your survival analysis, you will unknowingly flip your analysis: the program treats every censored participant as having had the event, and every participant who had the event as censored. You won’t get any warning or error message from the program. You’ll only get incorrect results. Worse, depending on how many censored and uncensored observations you have, the survival curve may not hint at any error. It may look like a perfectly reasonable survival curve for your data, even though it’s completely wrong.
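To see how quietly this mistake slips through, consider the following Python sketch. It uses a bare-bones Kaplan-Meier calculation written for this illustration, not the routine from any particular statistical package. The six follow-up times and both indicator lists are made-up data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events uses 1 for an observed event and 0 for a censored observation.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical data: follow-up time in months and two 0/1 indicators
times = [2, 3, 5, 7, 8, 10]
death = [1, 0, 1, 0, 0, 1]           # 1 = died, 0 = censored
censored = [1 - d for d in death]    # 1 = censored, 0 = died

correct = kaplan_meier(times, death)
flipped = kaplan_meier(times, censored)  # wrong indicator, yet no error

for label, curve in (("correct", correct), ("flipped", flipped)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Both calls run without complaint and both return a plausible-looking set of declining survival probabilities, but the steps fall at different times and end at different values. Nothing in the output tells you which one used the wrong variable.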

You have to read your software’s documentation carefully to make sure you code your event variable correctly. Also, you should always check the program’s output for the number of censored and uncensored observations and compare them to the known count of censored and uncensored participants in your data file.
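One way to build this check into your workflow is sketched below in Python. The function name, its arguments, and the example data are hypothetical. The sketch assumes the event indicator is coded 1 for an event and 0 for censored, and that you already know the expected counts from your study records.

```python
def check_event_indicator(event, censored, expected_events, expected_censored):
    """Return a list of problems found with a 0/1 event indicator."""
    problems = []
    if any(e not in (0, 1) for e in event):
        problems.append("event indicator has values other than 0 and 1")
    # Each participant should be flagged as exactly one of event or censored
    if any(e + c != 1 for e, c in zip(event, censored)):
        problems.append("event and censored indicators are not complementary")
    n_events = sum(event)
    n_censored = len(event) - n_events
    if n_events != expected_events:
        problems.append(f"{n_events} events found, expected {expected_events}")
    if n_censored != expected_censored:
        problems.append(f"{n_censored} censored found, expected {expected_censored}")
    return problems

# Hypothetical data file with 2 known deaths and 4 known censored participants
death = [1, 0, 1, 0, 0, 0]
censored = [0, 1, 0, 1, 1, 1]

ok = check_event_indicator(death, censored, 2, 4)
swapped = check_event_indicator(censored, death, 2, 4)  # indicators mixed up
print(ok)       # []
print(swapped)  # two count mismatches reported
```

Running a check like this before the analysis catches a swapped indicator whenever the number of events differs from the number of censored observations. If the two counts happen to be equal, the counts alone cannot reveal the swap, so you still need to confirm the coding against the documentation.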